AITopics | minimal model

Collaborating Authors

minimal model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

Neural Information Processing SystemsFeb-15-2026, 22:13:34 GMT

Work done while visiting Harvard University.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Add feedback

2c6ae45a3e88aee548c0714fad7f8269-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 01:07:20 GMT

bifurcation, bifurcation diagram, gradient, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
Europe > United Kingdom (0.04)
Europe > Hungary > Budapest > Budapest (0.04)
Europe > Belgium > Flanders (0.04)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

Non-Monotonic S4F Standpoint Logic (Extended Version with Proofs)

Gorczyca, Piotr, Strass, Hannes

arXiv.org Artificial IntelligenceNov-18-2025

Standpoint logics offer unified modal logic-based formalisms for representing multiple heterogeneous viewpoints. At the same time, many non-monotonic reasoning frameworks can be naturally captured using modal logics - in particular using the modal logic S4F. In this work, we propose a novel formalism called S4F Standpoint Logic, which generalises both S4F and propositional standpoint logic and is therefore capable of expressing multi-viewpoint, non-monotonic semantic commitments. We define its syntax and semantics and analyze its computational complexity, obtaining the result that S4F Standpoint Logic is not computationally harder than its constituent logics, whether in monotonic or non-monotonic form. We also outline mechanisms for credulous and sceptical acceptance and illustrate the framework with an example.

artificial intelligence, logic, logic & formal reasoning, (12 more...)

arXiv.org Artificial Intelligence

2511.10449

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Massachusetts (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

Neural Information Processing SystemsOct-10-2025, 06:23:54 GMT

Work done while visiting Harvard University.

experiment, matrix, transformer, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Add feedback

Riddled basin geometry sets fundamental limits to predictability and reproducibility in deep learning

Ly, Andrew, Gong, Pulin

arXiv.org Artificial IntelligenceOct-8-2025

Fundamental limits to predictability are central to our understanding of many physical and computational systems. Here we show that, despite its remarkable capabilities, deep learning exhibits such fundamental limits rooted in the fractal, riddled geometry of its basins of attraction: any initialization that leads to one solution lies arbitrarily close to another that leads to a different one. We derive sufficient conditions for the emergence of riddled basins by analytically linking features widely observed in deep learning, including chaotic learning dynamics and symmetry-induced invariant subspaces, to reveal a general route to riddling in realistic deep networks. The resulting basins of attraction possess an infinitely fine-scale fractal structure characterized by an uncertainty exponent near zero, so that even large increases in the precision of initial conditions yield only marginal gains in outcome predictability. Riddling thus imposes a fundamental limit on the predictability and hence reproducibility of neural network training, providing a unified account of many empirical observations. These results reveal a general organizing principle of deep learning with important implications for optimization and the safe deployment of artificial intelligence.

artificial intelligence, initialization, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2510.05606

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Minimal Model Reasoning in Description Logics: Don't Try This at Home!

Di Stefano, Federica, Manière, Quentin, Ortiz, Magdalena, Šimkus, Mantas

arXiv.org Artificial IntelligenceAug-8-2025

Reasoning with minimal models has always been at the core of many knowledge representation techniques, but we still have only a limited understanding of this problem in Description Logics (DLs). Minimization of some selected predicates, letting the remaining predicates vary or be fixed, as proposed in circumscription, has been explored and exhibits high complexity. The case of `pure' minimal models, where the extension of all predicates must be minimal, has remained largely uncharted. We address this problem in popular DLs and obtain surprisingly negative results: concept satisfiability in minimal models is undecidable already for $\mathcal{EL}$. This undecidability also extends to a very restricted fragment of tuple-generating dependencies. To regain decidability, we impose acyclicity conditions on the TBox that bring the worst-case complexity below double exponential time and allow us to establish a connection with the recently studied pointwise circumscription; we also derive results in data complexity. We conclude with a brief excursion to the DL-Lite family, where a positive result was known for DL-Lite$_{\text{core}}$, but our investigation establishes ExpSpace-hardness already for its extension DL-Lite$_{\text{horn}}$.

artificial intelligence, logic & formal reasoning, minimal model, (19 more...)

arXiv.org Artificial Intelligence

2508.0535

Country:

Europe (0.67)
North America > United States (0.28)

Genre: Research Report (0.81)

Industry:

Health & Medicine (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Description Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.92)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.86)

Add feedback

Forgetting in short and heterogeneous sequences of belief revisions

Liberatore, Paolo

arXiv.org Artificial IntelligenceMay-19-2025

Forgetting a specific belief revision episode may not erase information because the other revisions may provide or entail the same information. Whether it does was proved coNP-hard for sequences of two arbitrary lexicographic revisions or arbitrarily long lexicographic Horn revisions. A polynomial algorithm is presented for the case of two lexicographic Horn revision. Heterogeneous sequences, including revisions other than lexicographic, were proved to belong in Delta2. Their previously proved coNP-hardness is enhanced to Dp-hardness.

artificial intelligence, belief revision, logic & formal reasoning, (20 more...)

arXiv.org Artificial Intelligence

2504.13986

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)

Add feedback

Differential learning kinetics govern the transition from memorization to generalization during in-context learning

Nguyen, Alex, Reddy, Gautam

arXiv.org Artificial IntelligenceDec-12-2024

Transformers exhibit in-context learning (ICL): the ability to use novel information presented in the context without additional weight updates. Recent work shows that ICL emerges when models are trained on a sufficiently diverse set of tasks and the transition from memorization to generalization is sharp with increasing task diversity. One interpretation is that a network's limited capacity to memorize favors generalization. Here, we examine the mechanistic underpinnings of this transition using a small transformer applied to a synthetic ICL task. Using theory and experiment, we show that the sub-circuits that memorize and generalize can be viewed as largely independent. The relative rates at which these sub-circuits learn explains the transition from memorization to generalization, rather than capacity constraints. We uncover a memorization scaling law, which determines the task diversity threshold at which the network generalizes. The theory quantitatively explains a variety of other ICL-related phenomena, including the long-tailed distribution of when ICL is acquired, the bimodal behavior of solutions close to the task diversity threshold, the influence of contextual and data distributional statistics on ICL, and the transient nature of ICL.

icl, sequence, transition, (14 more...)

arXiv.org Artificial Intelligence

2412.00104

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Disentangling Linear Mode-Connectivity

Altintas, Gul Sena, Bachmann, Gregor, Noci, Lorenzo, Hofmann, Thomas

arXiv.org Artificial IntelligenceDec-15-2023

Linear mode-connectivity (LMC) (or lack thereof) is one of the intriguing characteristics of neural network loss landscapes. While empirically well established, it unfortunately still lacks a proper theoretical understanding. Even worse, although empirical data points are abound, a systematic study of when networks exhibit LMC is largely missing in the literature. In this work we aim to close this gap. We explore how LMC is affected by three factors: (1) architecture (sparsity, weight-sharing), (2) training strategy (optimization setup) as well as (3) the underlying dataset. We place particular emphasis on minimal but non-trivial settings, removing as much unnecessary complexity as possible. We believe that our insights can guide future theoretical works on uncovering the inner workings of LMC.

arxiv, dataset, lmc, (14 more...)

arXiv.org Artificial Intelligence

2312.09832

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback